Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning in light of such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.

Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 pages.
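To make the prediction task concrete, a minimal sketch of the classic constant-velocity baseline that this literature commonly compares against (the function name and setup here are illustrative, not from the survey):

```python
import numpy as np

def constant_velocity_predict(track, horizon):
    """Predict future 2D positions by extrapolating the last observed velocity.

    track: array-like of shape (T, 2) with observed positions.
    horizon: number of future steps to predict.
    Returns an array of shape (horizon, 2).
    """
    track = np.asarray(track, dtype=float)
    velocity = track[-1] - track[-2]           # last-step displacement
    steps = np.arange(1, horizon + 1)[:, None]
    return track[-1] + steps * velocity

# A pedestrian walking diagonally at constant speed:
obs = [[0, 0], [1, 1], [2, 2]]
print(constant_velocity_predict(obs, 3))       # extrapolates to (3,3), (4,4), (5,5)
```

Despite its simplicity, this baseline is a standard sanity check: learned predictors are expected to beat it on curved or interactive trajectories.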
Metric-Scale Truncation-Robust Heatmaps for 3D Human Pose Estimation
Heatmap representations have formed the basis of 2D human pose estimation
systems for many years, but their generalizations for 3D pose have only
recently been considered. This includes 2.5D volumetric heatmaps, whose X and Y
axes correspond to image space and the Z axis to metric depth around the
subject. To obtain metric-scale predictions, these methods must include a
separate, explicit post-processing step to resolve scale ambiguity. Further,
they cannot encode body joint positions outside of the image boundaries,
leading to incomplete pose estimates in case of image truncation. We address
these limitations by proposing metric-scale truncation-robust (MeTRo)
volumetric heatmaps, whose dimensions are defined in metric 3D space near the
subject, instead of being aligned with image space. We train a
fully-convolutional network to estimate such heatmaps from monocular RGB in an
end-to-end manner. This reinterpretation of the heatmap dimensions allows us to
estimate complete metric-scale poses without test-time knowledge of the focal
length or person distance and without relying on anthropometric heuristics in
post-processing. Furthermore, as the image space is decoupled from the heatmap
space, the network can learn to reason about joints beyond the image boundary.
Using ResNet-50 without any additional learned layers, we obtain
state-of-the-art results on the Human3.6M and MPI-INF-3DHP benchmarks. As our
method is simple and fast, it can become a useful component for real-time
top-down multi-person pose estimation systems. We make our code publicly
available to facilitate further research (see
https://vision.rwth-aachen.de/metro-pose3d).

Comment: Accepted for publication at the 2020 IEEE Conference on Automatic Face and Gesture Recognition (FG 2020).
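To illustrate how a metric-space volumetric heatmap can be decoded into coordinates, here is a hedged sketch of a soft-argmax readout over a cube defined in meters around the subject. The cube side length and function name are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def soft_argmax_3d(heatmap, cube_side_m=2.2):
    """Decode a volumetric heatmap into a metric 3D joint position.

    heatmap: array of shape (D, H, W) of unnormalized scores; the cube
    spans `cube_side_m` meters along each axis, centered on the subject.
    Returns the expected (x, y, z) position in meters.
    """
    probs = np.exp(heatmap - heatmap.max())    # softmax over all voxels
    probs /= probs.sum()
    D, H, W = heatmap.shape
    # Voxel-center coordinates in meters, from -side/2 to +side/2.
    zs = (np.arange(D) + 0.5) / D * cube_side_m - cube_side_m / 2
    ys = (np.arange(H) + 0.5) / H * cube_side_m - cube_side_m / 2
    xs = (np.arange(W) + 0.5) / W * cube_side_m - cube_side_m / 2
    x = (probs.sum(axis=(0, 1)) * xs).sum()    # marginalize, then expectation
    y = (probs.sum(axis=(0, 2)) * ys).sum()
    z = (probs.sum(axis=(1, 2)) * zs).sum()
    return np.array([x, y, z])

# A unimodal heatmap peaked at one voxel decodes near that voxel's center:
hm = np.zeros((8, 8, 8))
hm[2, 4, 6] = 10.0
print(soft_argmax_3d(hm))
```

Because every axis is metric, the decoded position needs no focal length or distance at test time, which is the key property the abstract describes.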
MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation
Heatmap representations have formed the basis of human pose estimation
systems for many years, and their extension to 3D has been a fruitful line of
recent research. This includes 2.5D volumetric heatmaps, whose X and Y axes
correspond to image space and Z to metric depth around the subject. To obtain
metric-scale predictions, 2.5D methods need a separate post-processing step to
resolve scale ambiguity. Further, they cannot localize body joints outside the
image boundaries, leading to incomplete estimates for truncated images. To
address these limitations, we propose metric-scale truncation-robust (MeTRo)
volumetric heatmaps, whose dimensions are all defined in metric 3D space,
instead of being aligned with image space. This reinterpretation of heatmap
dimensions allows us to directly estimate complete, metric-scale poses without
test-time knowledge of distance or relying on anthropometric heuristics, such
as bone lengths. To further demonstrate the utility of our representation, we
present a differentiable combination of our 3D metric-scale heatmaps with 2D
image-space ones to estimate absolute 3D pose (our MeTRAbs architecture). We
find that supervision via absolute pose loss is crucial for accurate
non-root-relative localization. Using a ResNet-50 backbone without further
learned layers, we obtain state-of-the-art results on Human3.6M, MPI-INF-3DHP
and MuPoTS-3D. Our code will be made publicly available to facilitate further
research.

Comment: See project page at https://vision.rwth-aachen.de/metrabs . Accepted for publication in the IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM), Special Issue "Selected Best Works From Automated Face and Gesture Recognition 2020". Extended version of FG paper arXiv:2003.0295
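The idea of combining metric-scale 3D estimates with 2D image-space evidence to obtain absolute pose can be illustrated with a generic least-squares alignment. This is a simplified stand-in, not the paper's differentiable architecture: given a root-relative 3D pose and its 2D projections in normalized camera coordinates, the camera-space translation follows from a linear system:

```python
import numpy as np

def recover_translation(pose3d, points2d_norm):
    """Least-squares absolute translation from a root-relative 3D pose
    and its 2D projections (normalized camera coordinates, i.e. pixel
    coordinates with the intrinsics removed).

    From the pinhole model u = (X+tx)/(Z+tz), v = (Y+ty)/(Z+tz),
    each joint gives two linear equations in t = (tx, ty, tz).
    """
    X, Y, Z = pose3d[:, 0], pose3d[:, 1], pose3d[:, 2]
    u, v = points2d_norm[:, 0], points2d_norm[:, 1]
    n = len(pose3d)
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0], A[0::2, 2], b[0::2] = 1.0, -u, u * Z - X   # tx - u*tz = uZ - X
    A[1::2, 1], A[1::2, 2], b[1::2] = 1.0, -v, v * Z - Y   # ty - v*tz = vZ - Y
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Synthetic check: project a pose under a known translation, then recover it.
rng = np.random.default_rng(0)
pose = rng.normal(scale=0.5, size=(17, 3))     # root-relative joints (meters)
t_true = np.array([0.3, -0.1, 4.0])            # camera-space translation
cam = pose + t_true
proj = cam[:, :2] / cam[:, 2:3]                # pinhole projection
print(recover_translation(pose, proj))         # close to t_true
```

The abstract's finding that an absolute pose loss is crucial suggests that learning this combination end to end outperforms such a fixed post-hoc alignment.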
Plug-and-Play SLAM: A Unified SLAM Architecture for Modularity and Ease of Use
Nowadays, SLAM (Simultaneous Localization and Mapping) is considered by the
Robotics community to be a mature field. Currently, there are many open-source
systems that are able to deliver fast and accurate estimation in typical
real-world scenarios. Still, these systems often provide ad-hoc
implementations tied to predefined sensor configurations. In this work,
we tackle this issue, proposing a novel SLAM architecture specifically designed
to address heterogeneous sensor configurations and to standardize SLAM
solutions. Thanks to its modularity and to specific design patterns, the
presented architecture is easy to extend, enhancing code reuse and efficiency.
Finally, using our solution, we conducted comparative experiments across a
variety of sensor configurations, obtaining competitive results that confirm
state-of-the-art performance.
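The modular, sensor-agnostic design the abstract describes can be sketched with a common interface that each sensor module implements, so the rest of the pipeline never depends on the sensor type. All class and method names here are illustrative assumptions, not taken from the actual system:

```python
from abc import ABC, abstractmethod

class SensorModule(ABC):
    """Common interface so the pipeline stays agnostic to the sensor type."""

    @abstractmethod
    def preprocess(self, raw):
        """Convert raw measurements into a common observation format."""

class LidarModule(SensorModule):
    def preprocess(self, raw):
        return {"type": "scan", "points": raw}

class CameraModule(SensorModule):
    def preprocess(self, raw):
        return {"type": "image", "pixels": raw}

class SlamPipeline:
    """Plugs arbitrary sensor modules into a single tracking/mapping loop."""

    def __init__(self):
        self.modules = {}

    def register(self, name, module):
        self.modules[name] = module

    def process(self, name, raw):
        obs = self.modules[name].preprocess(raw)
        # ...tracking and map update would consume `obs` here...
        return obs

pipeline = SlamPipeline()
pipeline.register("lidar", LidarModule())
pipeline.register("camera", CameraModule())
print(pipeline.process("lidar", [[0.0, 1.0]])["type"])  # → scan
```

Adding support for a new sensor then means writing one new module class and registering it, without touching the tracking or mapping code, which is the code-reuse benefit the abstract claims.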
- …